Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Define extrema using mapreduce; support init #36265

Closed
wants to merge 15 commits into from
Closed

Conversation

tkf
Copy link
Member

@tkf tkf commented Jun 13, 2020

Like #36188, but now for extrema.

This patch also switches the implementation of extrema to use mapreduce. It improves the performance even for simple arrays:

julia> @btime extrema($(randn(1000)));
  5.502 μs (0 allocations: 0 bytes)      # before (1.6.0-DEV.220)
  2.988 μs (0 allocations: 0 bytes)      # after (7b0e72efc8f7fbf2c0bd5d30a8b6e2e22288547f)

The improvement is much more drastic when mixing it with iterator transforms because it hits the transducer path.

julia> @btime extrema($(Iterators.flatten(((0, 1, 2), randn(1000)))));
  195.393 μs (5489 allocations: 133.03 KiB)  # before (1.6.0-DEV.220)
  3.469 μs (0 allocations: 0 bytes)          # after (7b0e72efc8f7fbf2c0bd5d30a8b6e2e22288547f)

Of course, for arrays, we might want to use the same unrolling trick as in minimum/maximum. But that's tracked by #31442 and can be done later. I think this PR would be a good generic fallback that is useful as a baseline for #31442.

One thing not so great about this PR is that I had to copy the old iterate-only extrema implementation to compiler.jl as apparently it is required for bootstrap. I think it might be nice to separate out a very minimal portion of reduce.jl so that it can be included from the compiler. But that's a bit too large refactoring to do in this PR.

@tkf tkf requested review from timholy and nalimilan June 13, 2020 05:13
@tkf tkf added the domain:fold sum, maximum, reduce, foldl, etc. label Jun 13, 2020
Copy link
Member

@nalimilan nalimilan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few superficial comments.

base/multidimensional.jl Outdated Show resolved Hide resolved
Comment on lines -495 to -557
extrema(x::Real) = (x, x)
extrema(f, x::Real) = (y = f(x); (y, y))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aren't these still useful?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mapreduce has a specialization for Number:

mapreduce(f, op, a::Number) = mapreduce_first(f, op, a)

So, I think the compiler will generate the equivalent machine code.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool. Maybe worth checking just in case?

base/multidimensional.jl Outdated Show resolved Hide resolved
base/reduce.jl Outdated Show resolved Hide resolved
base/reduce.jl Outdated Show resolved Hide resolved
base/compiler/compiler.jl Show resolved Hide resolved
Copy link
Sponsor Member

@timholy timholy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! Needing a special version in compiler isn't so bad, IMO.

base/reduce.jl Outdated Show resolved Hide resolved
@simeonschaub
Copy link
Member

Bump. Looks like this just needs to be rebased.

@Moelf
Copy link
Sponsor Contributor

Moelf commented Aug 21, 2021

bump, hitting extrema slower than (maximum, minimum) again recently

@tkf tkf added the status:merge me PR is reviewed. Merge when all tests are passing label Nov 13, 2021
@DilumAluthge
Copy link
Member

@tkf The CI failures seem related to the contents of the PR.

@DilumAluthge DilumAluthge removed the status:merge me PR is reviewed. Merge when all tests are passing label Nov 13, 2021
@tkf
Copy link
Member Author

tkf commented Nov 13, 2021

This passed the tests for SparseArrays locally but I noticed that it's not hitting the right method. Unfortunately, the build process ignores missing name on import and happy define HigherOrderFns._extrema_itr

Profile  ──────────  0.286842 seconds
WARNING: could not import Base._extrema_itr into HigherOrderFns
SparseArrays  ─────  2.713728 seconds

We need some fix in SparseArrays to make it reasonably correct and practically OK. (100% correctness is impossible with the current facility in Base because SparseArrays seem to want to assume max and min are commutative. We need to revive #36424 for this.)

@tkf tkf marked this pull request as draft November 13, 2021 05:15
@JeffBezanson JeffBezanson added the performance Must go faster label Jan 5, 2022
@tkf
Copy link
Member Author

tkf commented Jan 26, 2022

#43604 took care of this

@tkf tkf closed this Jan 26, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
domain:fold sum, maximum, reduce, foldl, etc. performance Must go faster
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants